127 research outputs found

    Motif-All: discovering all phosphorylation motifs

    Background: Phosphorylation motifs represent common patterns around the phosphorylation site. The discovery of such motifs reveals the underlying regulation mechanism and facilitates the prediction of unknown phosphorylation events. To date, large amounts of phosphorylation data have been gathered, making it possible to perform substrate-driven motif discovery using data mining techniques. Results: We describe an algorithm called Motif-All that is able to efficiently identify all statistically significant motifs. The proposed method exploits a support constraint to reduce the search space and avoid generating random artifacts. As the number of phosphorylated peptides is far smaller than that of unphosphorylated ones, we divide the mining process into two stages: the first stage generates candidates from the set of phosphorylated sequences using only the support constraint, and the second stage tests the statistical significance of each candidate using the odds ratio derived from the whole data set. Experimental results on real data show that Motif-All outperforms current algorithms in terms of both effectiveness and efficiency. Conclusions: Motif-All is a useful tool for discovering statistically significant phosphorylation motifs. Source code and data sets are available at: http://bioinformatics.ust.hk/MotifAll.rar
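
The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is not the Motif-All implementation: the motif representation (a set of position/residue pairs over fixed-width peptide windows), the level-wise candidate generation, the 0.5 continuity correction, and all function names are assumptions made for illustration.

```python
from itertools import combinations

def matches(peptide, motif):
    """A motif is a frozenset of (position, residue) pairs (assumed representation)."""
    return all(peptide[pos] == res for pos, res in motif)

def support(peptides, motif):
    """Number of peptides that contain the motif."""
    return sum(matches(p, motif) for p in peptides)

def frequent_motifs(foreground, min_support, max_size=2):
    """Stage 1: level-wise candidate generation on the phosphorylated set only,
    keeping motifs whose support reaches min_support."""
    width = len(foreground[0])
    items = {frozenset([(i, p[i])]) for p in foreground for i in range(width)}
    level = {m for m in items if support(foreground, m) >= min_support}
    found = set(level)
    for _ in range(max_size - 1):
        # Join two frequent motifs that differ in exactly one position.
        candidates = {
            a | b
            for a, b in combinations(level, 2)
            if len(a | b) == len(a) + 1
            and len({pos for pos, _ in a | b}) == len(a | b)
        }
        level = {m for m in candidates if support(foreground, m) >= min_support}
        found |= level
    return found

def odds_ratio(motif, foreground, background):
    """Stage 2: odds ratio of motif occurrence in phosphorylated vs.
    unphosphorylated peptides, with a 0.5 correction to avoid division by zero
    (the correction is an assumption of this sketch)."""
    a = support(foreground, motif) + 0.5
    b = len(foreground) - support(foreground, motif) + 0.5
    c = support(background, motif) + 0.5
    d = len(background) - support(background, motif) + 0.5
    return (a * d) / (b * c)
```

Because stage 1 only touches the smaller phosphorylated set, the expensive pass over the full data set is limited to candidates that already satisfy the support constraint. The significance threshold applied to the odds ratio is left to the caller here.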

    Sorting Signals, N-Terminal Modifications and Abundance of the Chloroplast Proteome

    Characterization of the chloroplast proteome is needed to understand the essential contribution of the chloroplast to plant growth and development. Here we present a large-scale analysis by nanoLC-Q-TOF and nanoLC-LTQ-Orbitrap mass spectrometry (MS) of ten independent chloroplast preparations from Arabidopsis thaliana, which unambiguously identified 1325 proteins. Novel proteins include various kinases and putative nucleotide binding proteins. Based on repeated and independent MS-based protein identifications requiring multiple matched peptide sequences, as well as literature, 916 nuclear-encoded proteins were assigned with high confidence to the plastid, of which 86% had a predicted chloroplast transit peptide (cTP). The protein abundance of soluble stromal proteins was calculated from normalized spectral counts from LTQ-Orbitrap analysis and was found to cover four orders of magnitude. Comparison to gel-based quantification demonstrates that ‘spectral counting’ can provide large-scale protein quantification for Arabidopsis. This quantitative information was used to determine possible biases for protein targeting prediction by TargetP and also to understand the significance of protein contaminants. The abundance data for 550 stromal proteins were used to understand the abundance of metabolic pathways and chloroplast processes. We highlight the abundance of 48 stromal proteins involved in post-translational proteome homeostasis (including aminopeptidases, proteases, deformylases, chaperones, protein sorting components) and discuss the biological implications. N-terminal modifications were identified for a subset of nuclear- and chloroplast-encoded proteins and a novel N-terminal acetylation motif was discovered. Analysis of cTPs and their cleavage sites of Arabidopsis chloroplast proteins, as well as their predicted rice homologues, identified new species-dependent features, which will facilitate improved subcellular localization prediction. No evidence was found for suggested targeting via the secretory system. This study provides the most comprehensive chloroplast proteome analysis to date and an expanded Plant Proteome Database (PPDB) in which all MS data are projected on identified gene models.

    Transcriptional Regulation of Ribosome Components Are Determined by Stress According to Cellular Compartments in Arabidopsis thaliana

    Plants have to coordinate the production of eukaryotic ribosomes (cytoribosomes) and prokaryotic ribosomes (plastoribosomes and mitoribosomes) to balance cellular protein synthesis in response to environmental variations. We identified 429 genes encoding potential ribosomal proteins (RP) in Arabidopsis thaliana. Because cytoribosome proteins are encoded by small nuclear gene families, plastid RP by nuclear and plastid genes and mitochondrial RP by nuclear and mitochondrial genes, several transcriptional pathways were attempted to control ribosome amounts. Examining two independent genomic expression datasets, we found two groups of RP genes showing very different and specific expression patterns in response to environmental stress. The first group represents the nuclear genes coding for plastid RP, whereas the second group is composed of a subset of cytoribosome genes coding for RP isoforms. By contrast, the other cytoribosome genes and mitochondrial RP genes show less constraint in their response to stress conditions. The two subsets of cytoribosome genes code for different RP isoforms. During stress, the response of the intensively regulated subset leads to dramatic variation in ribosome diversity. Most RP genes have the same promoter structure, with two motifs at conserved positions. The stress response of the nuclear genes coding for plastid RP is related to the absence of an interstitial telomere motif known as the telo box in their promoters. We propose a model for the “ribosome code” that influences ribosome biogenesis through three main transcriptional pathways. The first pathway controls the basal program of cytoribosome and mitoribosome biogenesis. The second pathway involves a subset of cytoRP genes that are co-regulated under stress conditions. The third independent pathway is devoted to the control of plastoribosome biosynthesis by regulating both nuclear and plastid genes.

    Integrative Identification of Arabidopsis Mitochondrial Proteome and Its Function Exploitation through Protein Interaction Network

    Mitochondria are major players in the production of energy, and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown even for the model plant Arabidopsis. We report a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false positive rate (FPR). Together with the experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored those proteins with unknown functions based on a protein-protein interaction network (PIN) and annotated novel functions for 26.65% of CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome.
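
As a rough illustration of how a naive Bayes model can integrate independent lines of evidence, the sketch below combines binary features by summing log-likelihood ratios. The feature names, the binary encoding, and the example probabilities are hypothetical and are not taken from the paper.

```python
import math

def naive_bayes_posterior(evidence, likelihoods, prior):
    """Posterior probability that a protein is mitochondrial.

    evidence:    {feature_name: True/False} observations for one protein
    likelihoods: {feature_name: (P(feature | mito), P(feature | non-mito))}
    prior:       P(mito) before seeing any evidence

    Features are treated as conditionally independent given the class,
    which is the defining assumption of naive Bayes.
    """
    log_odds = math.log(prior / (1 - prior))
    for name, observed in evidence.items():
        p_pos, p_neg = likelihoods[name]
        if observed:
            log_odds += math.log(p_pos / p_neg)
        else:
            log_odds += math.log((1 - p_pos) / (1 - p_neg))
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical per-feature likelihoods, for illustration only.
likelihoods = {
    "targeting_tool_hit": (0.8, 0.1),
    "mito_ortholog": (0.7, 0.2),
}
posterior = naive_bayes_posterior(
    {"targeting_tool_hit": True, "mito_ortholog": True}, likelihoods, prior=0.1
)
```

Working in log-odds keeps the combination additive, so each tool, orthology mapping, or co-expression signal contributes one term and new evidence sources can be added without changing the rest of the model.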

    An “Electronic Fluorescent Pictograph” Browser for Exploring and Analyzing Large-Scale Biological Data Sets

    Background. The exploration of microarray data and data from other high-throughput projects for hypothesis generation has become a vital aspect of post-genomic research. For the non-bioinformatics specialist, however, many of the currently available tools provide overwhelming amounts of data that are presented in a non-intuitive way. Methodology/Principal Findings. In order to facilitate the interpretation and analysis of microarray data and data from other large-scale data sets, we have developed a tool, which we have dubbed the electronic Fluorescent Pictograph – or eFP – Browser, available a

    Hydrogen Peroxide Acts on Sensitive Mitochondrial Proteins to Induce Death of a Fungal Pathogen Revealed by Proteomic Analysis

    How the host cells of plants and animals protect themselves against fungal invasion is a biologically interesting and economically important problem. Here we investigate the mechanistic process that leads to death of Penicillium expansum, a widespread phytopathogenic fungus, by identifying the cellular compounds affected by hydrogen peroxide (H2O2), which is frequently produced as a response of the host cells. We show that plasma membrane damage was not the main reason for H2O2-induced death of the fungal pathogen. Proteomic analysis of the changes of total cellular proteins in P. expansum showed that a large proportion of the differentially expressed proteins appeared to be of mitochondrial origin, implying that mitochondria may be involved in this process. We then performed mitochondrial sub-proteomic analysis to seek the H2O2-sensitive proteins in P. expansum. A set of mitochondrial proteins was identified, including respiratory chain complexes I and III, F1F0 ATP synthase, and mitochondrial phosphate carrier protein. The functions of several proteins were further investigated to determine their effects on the H2O2-induced fungal death. Through fluorescent co-localization and the use of a specific inhibitor, we provide evidence that complex III of the mitochondrial respiratory chain contributes to ROS generation in fungal mitochondria under H2O2 stress. The undesirable accumulation of ROS caused oxidative damage of mitochondrial proteins and led to the collapse of mitochondrial membrane potential. Meanwhile, we demonstrate that ATP synthase is involved in the response of the fungal pathogen to oxidative stress, because inhibition of ATP synthase by oligomycin decreases survival. Our data suggest that mitochondrial impairment due to functional alteration of oxidative stress-sensitive proteins is associated with fungal death caused by H2O2.

    Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites

    Experimentally-determined or computationally-predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice-Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice-phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice-Phospho 1.0 to be 82.0% and 0.64, respectively, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice-Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC-Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice-Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
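
To make the encoding concrete, here is a minimal Python sketch of a CKSAAP-style feature vector with amino acid frequencies prepended. The way the two parts are combined (simple concatenation), the default `k_max`, and the function names are assumptions for illustration and may differ from the encoding used in Rice-Phospho 1.0.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def amino_acid_frequency(window):
    """Fraction of the window occupied by each of the 20 amino acids."""
    n = len(window)
    return [window.count(a) / n for a in AMINO_ACIDS]

def cksaap(window, k_max=2):
    """For each gap k = 0..k_max, the fraction of residue pairs (i, i+k+1)
    that equal each of the 400 ordered amino acid pairs.
    The window must be longer than k_max + 1 residues."""
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = len(window) - k - 1  # number of k-spaced pairs in the window
        for i in range(total):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip non-standard residues or padding
                counts[pair] += 1
        features.extend(counts[p] / total for p in PAIRS)
    return features

def af_cksaap(window, k_max=2):
    """Concatenate amino acid frequencies with CKSAAP features (assumed layout)."""
    return amino_acid_frequency(window) + cksaap(window, k_max)
```

With `k_max=2` the vector has 20 + 3 × 400 = 1220 entries per sequence window. Vectors of this kind can then be passed to any SVM implementation as fixed-length numeric input.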

    PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity

    Background: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. Due to the difficulty in performing high-throughput mass spectrometry-based experiments, there is a desire to predict phosphorylation sites using computational methods. However, previous studies regarding in silico prediction of plant phosphorylation sites lack the consideration of kinase-specific phosphorylation data. Thus, we are motivated to propose a new method that investigates different substrate specificities in plant phosphorylation sites. Results: Experimentally verified phosphorylation data were extracted from TAIR9, a protein database containing 3006 phosphorylation data from the plant species Arabidopsis thaliana. In an attempt to investigate the various substrate motifs in plant phosphorylation, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. A profile hidden Markov model (HMM) is then applied to learn a predictive model for each subgroup. Cross-validation evaluation on the MDD-clustered HMMs yields an average accuracy of 82.4% for serine, 78.6% for threonine, and 89.0% for tyrosine models. Moreover, independent test results using Arabidopsis thaliana phosphorylation data from UniProtKB/Swiss-Prot show that the proposed models are able to correctly predict 81.4% of phosphoserine, 77.1% of phosphothreonine, and 83.7% of phosphotyrosine sites. Interestingly, several MDD-clustered subgroups are observed to have similar amino acid conservation with the substrate motifs of well-known kinases from Phospho.ELM, a database containing kinase-specific phosphorylation data from multiple organisms. Conclusions: This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs. Based on cross-validation and independent testing, results show that the MDD-clustered models outperform models trained without using MDD. The proposed method has been implemented as a web-based plant phosphorylation prediction tool, PlantPhos (http://csb.cse.yzu.edu.tw/PlantPhos/). Additionally, two case studies have been demonstrated to further evaluate the effectiveness of PlantPhos.

    'Unite and conquer': enhanced prediction of protein subcellular localization by integrating multiple specialized tools

    Background: Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for a given protein. Since the individual tools each have particular strengths, we set out to integrate them in a way that optimally exploits their potential. The method we present here is applicable to various subcellular locations, but tailored for predicting whether or not a protein is localized in mitochondria. Knowledge of the mitochondrial proteome is relevant to understanding the role of this organelle in global cellular processes. Results: In order to develop a method for enhanced prediction of subcellular localization, we integrated the outputs of available localization prediction tools by several strategies, and tested the performance of each strategy with known mitochondrial proteins. The accuracy obtained (up to 92%) surpasses by far that of the individual tools. The method of integration proved crucial to the performance. For the prediction of mitochondrion-located proteins, integration via a two-layer decision tree clearly outperforms simpler methods, as it allows emphasis of biologically relevant features such as the mitochondrial targeting peptide and transmembrane domains. Conclusion: We developed an approach that enhances the prediction accuracy of mitochondrial proteins by uniting the strength of specialized tools. The combination of machine-learning based integration with biological expert knowledge leads to improved performance. This approach also alleviates the conundrum of how to choose between conflicting predictions. Our approach is easy to implement, and applicable to predicting subcellular locations other than mitochondria, as well as other biological features. For a trial of our approach, we provide a webservice for mitochondrial protein prediction (named YimLOC), which can be accessed through the AnaBench suite at http://anabench.bcm.umontreal.ca/anabench/. The source code is provided in Additional File 2. Additional file 2 (1471-2105-8-420-S2.pdf): This file contains scripts for the online server YimLOC. Please note that these scripts only cover the ready-to-use STACK-mem-DT described in the main text. The scripts do not provide the training process.

    Variation in Molybdenum Content Across Broadly Distributed Populations of Arabidopsis thaliana Is Controlled by a Mitochondrial Molybdenum Transporter (MOT1)

    Molybdenum (Mo) is an essential micronutrient for plants, serving as a cofactor for enzymes involved in nitrate assimilation, sulfite detoxification, abscisic acid biosynthesis, and purine degradation. Here we show that natural variation in shoot Mo content across 92 Arabidopsis thaliana accessions is controlled by variation in a mitochondrially localized transporter (Molybdenum Transporter 1 - MOT1) that belongs to the sulfate transporter superfamily. A deletion in the MOT1 promoter is strongly associated with low shoot Mo, occurring in seven of the accessions with the lowest shoot content of Mo. Consistent with the low Mo phenotype, MOT1 expression in low Mo accessions is reduced. Reciprocal grafting experiments demonstrate that the roots of Ler-0 are responsible for the low Mo accumulation in shoot, and GUS localization demonstrates that MOT1 is expressed strongly in the roots. MOT1 contains an N-terminal mitochondrial targeting sequence, and expression of GFP-tagged MOT1 in protoplasts and transgenic plants establishes the mitochondrial localization of this protein. Furthermore, expression of MOT1 specifically enhances Mo accumulation in yeast by 5-fold, consistent with MOT1 functioning as a molybdate transporter. This work provides the first molecular insight into the processes that regulate Mo accumulation in plants and shows that novel loci can be detected by association mapping.